Analyzing Population and Land Cover Changes in NY Appalachia (2010-2019)
GEO511
Author
Stephen C. Sanders
Published
December 4, 2024
Introduction
The Southern Tier of New York consists of fourteen counties that are considered part of Northern Appalachia, according to the Appalachian Regional Commission. These counties have experienced some of the most drastic population declines in New York State outside of New York City (McMahon, 2024). On top of the decline in population, some areas in most of the Southern Tier suffer from relatively high poverty and low income levels compared to the rest of the United States as a whole (ARC, 2024 County-Level Distressed Area Study).
Map of three subregions in Appalachian NY (NY Department of State)
Some areas in Appalachia - namely Eastern Kentucky - have undergone extensive land reclamation efforts in an attempt to repair land that had been damaged during the era of intense coal mining operations, converting much of the reclaimed land into infrastructure, industry, and other development despite consistent population decline (KC, Gyawali, Lucas, et al., 2024). Coal mining was not a large industry in this area at any time, and much of the manufacturing was limited to cities such as Jamestown and Binghampton. While the need of a study concerning land cover for the purposes of land reclamation may not be necessary for the Southern Tier, an analysis of the region could lead to interesting insights about how the population decline may have impacted land cover changes. The purpose of this analysis is to get a general idea of the changes in population/economic variables and land cover in New York’s Southern Tier between 2010 and 2019 and how they may relate to each other. Specifically, are counties with a higher change in distressed population more likely to have had an increase in natural barren land and/or a decrease in developed area?
Materials and Methods
[~ 200 words]
Narrative: Clear narrative description of the data sources and methods. Includes data from at least two sources that were integrated / merged in R.
Code: The code associated with the project is well organized and easy to follow. Demonstrates mastery of R graphics and functions.
Data: The underlying data are publicly accessible via the web and downloaded/accessed within the Rmd script. If you want to use your own data, you must make it available on a website (e.g. Figshare) so that others are able to re-run your code.
You can do numbers like this:
The first most important thing
The second most important thing
The third most important thing
Data Gathering and Processing
Code
# load necessary librarieslibrary(tidyverse)library(tidycensus)library(sf)library(tigris)library(basemaps)library(leaflet)library(mapview)library(terra)library(tidyterra)library(gt)library(heatmaply)library(stringr)library(RColorBrewer)library(gridExtra)library(reshape2)library(here)library(knitr)knitr::opts_chunk$set(echo=TRUE) # cache the results for quick compiling# pull in census api tokencensus_token =Sys.getenv("CENSUS_TOKEN")census_api_key(census_token)
Map of Study Area
Add some text here
Code
# increase timeout time to download files and cache tigris filesoptions(timeout =500, tigris_use_cache =TRUE)# set vector of southern tier countiessouthern_tier_counties =c('Allegany', 'Broome', 'Cattaraugus', 'Chautauqua', 'Chemung', 'Chenango', 'Cortland', 'Delaware', 'Otsego', 'Schoharie', 'Schuyler', 'Steuben', 'Tioga', 'Tompkins')# get study geographies and basemap focused on ny stateny_state <-states(cb =TRUE, year =2019) %>%filter(NAME =='New York')study_counties <-counties(state ='NY', cb =TRUE, year =2019) %>%filter(NAME %in% southern_tier_counties)study_tracts <-tracts(state ='NY', county = southern_tier_counties, cb =TRUE, year =2019)base_ny <-basemap_raster(ext = ny_state, map_service ='carto', map_type ='light')
Loading basemap 'light' from map service 'carto'...
Code
# create study area mapstudy_area_map <-leaflet() %>%addTiles() %>%addPolygons(data =st_transform(study_counties, crs ='+proj=longlat +datum=WGS84'),fillOpacity =0, label =~study_counties$NAME) %>%addPolygons(data =st_transform(study_tracts, crs ='+proj=longlat +datum=WGS84'),fillOpacity =0, color ='black', weight ='0.75')study_area_map
Download and Process All Required Data
Add some text here
ACS Data
Add some text here
Set function to download ACS data based on geography
# get certain decennial 2010 census data at state level# then sum population and add additional data points taken from governmental reportsus_data.2010<-get_decennial(geography ='state',variables =c(tot_pop ='P001001'),year =2010,output ='wide',cache_table =TRUE,geometry =TRUE) %>%mutate(NAME ='United States') %>%group_by(NAME) %>%summarize(tot_popE =sum(tot_pop)) %>%mutate(tot_hhsE =116716292,median_ageE =37.2,pct_below_povertyE =15.3,median_incomeE =64400 ) %>%relocate(geometry, .after =last_col()) %>%as_tibble()
Getting data from the 2010 decennial Census
Using Census Summary File 1
Code
# pull acs data at national levelus_data.2014_2019 <-map2(2014:2019, rep('us', times =6), get_acs_data) %>%bind_rows() %>%as_tibble()
Getting data from the 2010-2014 5-year ACS
Getting data from the 2011-2015 5-year ACS
Getting data from the 2012-2016 5-year ACS
Getting data from the 2013-2017 5-year ACS
Getting data from the 2014-2018 5-year ACS
Getting data from the 2015-2019 5-year ACS
Code
# separate 2014-2019 us data into 2015 and 2019 data framesus_data.2015<- us_data.2014_2019 %>%filter(year ==2015)us_data.2019<- us_data.2014_2019 %>%filter(year ==2019)# merge us data for 2010, 2015, and 2019 into a single data frameus_data.2010_2019 <-left_join(us_data.2010, us_data.2015, by ='NAME', keep =FALSE, suffix =c('', '_2015')) %>%left_join(us_data.2019, by ='NAME', keep =TRUE, suffix =c('_2010', '_2019')) %>%select(!all_of(grep('M_', names(.), value =TRUE))) %>%# remove margin of error columnsrename(GEOID = GEOID_2010,NAME = NAME_2010,geometry = geometry_2010,year = year_2010,pct_below_povertyE_2010 = pct_below_povertyE,tot_below_povertyE_2015 = tot_below_povertyE_2010,hhs_no_vehicleE_2015 = hhs_no_vehicleE_2010 ) %>%relocate(GEOID, .after = NAME) %>%select(-c('GEOID_2019', 'NAME_2019', 'geometry_2015', 'geometry_2019', 'year', 'year_2019')) %>%mutate(pct_below_poverty_2015 = (tot_below_povertyE_2015 / tot_popE_2015) *100,pct_below_poverty_2019 = (tot_below_povertyE_2019 / tot_popE_2019) *100,pct_hhs_no_vehicle_2015 = (hhs_no_vehicleE_2015 / tot_hhsE_2015) *100,pct_hhs_no_vehicle_2019 = (hhs_no_vehicleE_2019 / tot_hhsE_2019) *100,pct_tot_pop_chg_2010_2019 = ((tot_popE_2019 - tot_popE_2010) / tot_popE_2010) *100,pct_median_age_chg_2015_2019 = ((median_ageE_2015 - median_ageE_2010) / median_ageE_2015) *100,pct_median_income_chg_2010_2019 = ((median_incomeE_2019 - median_incomeE_2010) / median_incomeE_2010) *100,pct_below_poverty_chg_2010_2019 = ((pct_below_poverty_2019 - pct_below_povertyE_2010) / pct_below_povertyE_2010) *100,pct_hhs_no_vehicle_chg_2015_2019 = ((pct_hhs_no_vehicle_2019 - pct_hhs_no_vehicle_2015) / pct_hhs_no_vehicle_2015) *100 ) %>%lapply(function(i) if(is.numeric(i)) ifelse(is.infinite(i), 0, i) else i) %>%as_tibble() %>%relocate(geometry, .after =last_col()) %>%st_as_sf(crs ="EPSG: 4269")
Census Tract Data
Add some text here
Code
# get census tract data for 2010-2019# isolate data for 2010, 2015, and 2019 and put into separate dfsst_tracts <-map2(2010:2019, rep('tract', times =10), get_acs_data) %>%bind_rows() %>%as_tibble()
# filter data for years 2010, 2015, and 2019, join with that year's economic data,# and store in separate data frames for each yearst_tracts.2010<-# health insurance coverage data not available for 2010 st_tracts %>%filter(year ==2010) %>%left_join(select(filter(st_tracts.economic, year ==2010), GEOID, pop_gt_16_in_civilian_labor_force_2010, pop_gt_16_in_civilian_labor_force_unemployed_2010 ) ) %>%relocate(geometry, .after =last_col())
# merge 2010, 2015, & 2019 data into single sf# compute pct chgs, convert to sf, and change infinite values to NAst_tracts.2010_2019 <-left_join(st_tracts.2010, st_tracts.2015, by ='GEOID', keep =FALSE, suffix =c('', '_2015')) %>%left_join(st_tracts.2019, by ='GEOID', keep =FALSE, suffix =c('_2010', '_2019')) %>%select(-c('NAME_2015', 'geometry_2015', 'year_2015', 'county_2015', 'NAME_2019', 'geometry_2019', 'year_2019', 'county_2019')) %>%rename(NAME = NAME_2010,geometry = geometry_2010,county = county_2010,year = year_2010 ) %>%mutate(across(starts_with('pop_gt_16'), ~as.numeric(as.character(.))),pct_pop_chg_2010_2019 = ((tot_popE_2019 - tot_popE_2010) / tot_popE_2010) *100,pct_poverty_chg_2010_2019 = ((pct_below_poverty_2019 - pct_below_poverty_2010) / pct_below_poverty_2010) *100,pct_hhs_no_vehicles_chg_2010_2019 = ((pct_hhs_no_vehicles_2019 - pct_hhs_no_vehicles_2010) / pct_hhs_no_vehicles_2010) *100,pct_pop_gt_16_in_civilian_labor_force_unemployed_2010 = pop_gt_16_in_civilian_labor_force_unemployed_2010 / pop_gt_16_in_civilian_labor_force_2010,pct_pop_gt_16_in_civilian_labor_force_unemployed_2015 = pop_gt_16_in_civilian_labor_force_unemployed_2015 / pop_gt_16_in_civilian_labor_force_2015,pct_pop_gt_16_in_civilian_labor_force_unemployed_2019 = pop_gt_16_in_civilian_labor_force_unemployed_2019 / pop_gt_16_in_civilian_labor_force_2019,pct_pop_gt_16_in_civilian_labor_force_unemployed_chg_2010_2019 = ((pct_pop_gt_16_in_civilian_labor_force_unemployed_2019 - pct_pop_gt_16_in_civilian_labor_force_unemployed_2010) / pct_pop_gt_16_in_civilian_labor_force_unemployed_2010) *100 ) %>%lapply(function(i) if(is.numeric(i)) ifelse(is.infinite(i), 0, i) else i) %>%as_tibble() %>%relocate(geometry, .after =last_col()) %>%st_as_sf(crs ="EPSG: 4269")######################################################################## REMOVED 36013990000 Census Tract 9900, Chautauqua County, New York## Had no estimated population AND had empty geometry######################################################################
County Data
Add some text here
Code
# pull southern tier county data and calculate area in square milesst_counties <-map2(2010:2019, rep('county', times =10), get_acs_data) %>%bind_rows() %>%as_tibble() %>%st_as_sf(crs ="EPSG: 4269") %>%mutate(area_sqmi =st_area(.) /2589988.110336) %>%relocate(geometry, .after =last_col())
Getting data from the 2006-2010 5-year ACS
Getting data from the 2007-2011 5-year ACS
Getting data from the 2008-2012 5-year ACS
Getting data from the 2009-2013 5-year ACS
Getting data from the 2010-2014 5-year ACS
Getting data from the 2011-2015 5-year ACS
Getting data from the 2012-2016 5-year ACS
Getting data from the 2013-2017 5-year ACS
Getting data from the 2014-2018 5-year ACS
Getting data from the 2015-2019 5-year ACS
Code
# separate st_counties for 2010, 2015, and 2019 into separate dataframesst_counties.2010<- st_counties %>%filter(year ==2010) %>%as_tibble()st_counties.2015<- st_counties %>%filter(year ==2015) %>%as_tibble()st_counties.2019<- st_counties %>%filter(year ==2019) %>%as_tibble()# merge 2010, 2015, & 2019 data into single sf# compute pct chgs, convert to sf, and change infinite values to NA# will merge with grouped st_tracts.2010_2019 data belowst_counties_base.2010_2019 <-left_join(st_counties.2010, st_counties.2015, by ='GEOID', keep =FALSE, suffix =c('', '_2015')) %>%left_join(st_counties.2019, by ='GEOID', keep =FALSE, suffix =c('_2010', '_2019')) %>%select(-c('NAME_2015', 'geometry_2015', 'year_2015', 'county_2015', 'NAME_2019', 'geometry_2019', 'year_2019', 'county_2019')) %>%rename(NAME = NAME_2010,geometry = geometry_2010,county = county_2010,year = year_2010 ) %>%mutate(pct_median_age_chg_2010_2019 = ((median_ageE_2019 - median_ageE_2010) / median_ageE_2010) *100,pct_median_income_chg_2010_2019 = ((median_incomeE_2019 - median_incomeE_2010) / median_incomeE_2010) *100 ) %>%lapply(function(i) if(is.numeric(i)) ifelse(is.infinite(i), 0, i) else i) %>%as_tibble() %>%relocate(geometry, .after =last_col()) %>%st_as_sf(crs ="EPSG: 4269")# get list of all pct columnspct_cols <-grep('pct', names(st_tracts.2010_2019), value =TRUE)# calculate data from st_tracts.2010_2019 by grouping by county, then merge with st_counties_base.2010_2019# create data frame of all county-level datast_counties.2010_2019 <- st_tracts.2010_2019 %>%select(!all_of(pct_cols)) %>%group_by(county) %>%reframe(tot_pop_2010 =sum(tot_popE_2010),tot_pop_2015 =sum(tot_popE_2015),tot_pop_2019 =sum(tot_popE_2019),tot_hhs_2010 =sum(tot_hhsE_2010),tot_hhs_2015 =sum(tot_hhsE_2015),tot_hhs_2019 =sum(tot_hhsE_2019),tot_below_poverty_2010 =sum(tot_below_povertyE_2010),tot_below_poverty_2015 =sum(tot_below_povertyE_2015),tot_below_poverty_2019 =sum(tot_below_povertyE_2019),hhs_no_vehicles_2010 =sum(hhs_no_vehicleE_2010),hhs_no_vehicles_2015 =sum(hhs_no_vehicleE_2015),hhs_no_vehicles_2019 =sum(hhs_no_vehicleE_2019),pop_gt_16_in_civ_lab_force_2010 =sum(pop_gt_16_in_civilian_labor_force_2010),pop_gt_16_in_civ_lab_force_2015 =sum(pop_gt_16_in_civilian_labor_force_2015),pop_gt_16_in_civ_lab_force_2019 =sum(pop_gt_16_in_civilian_labor_force_2019),pop_gt_16_in_civ_lab_force_unemployed_2010 = pop_gt_16_in_civilian_labor_force_unemployed_2010,pop_gt_16_in_civ_lab_force_unemployed_2015 = pop_gt_16_in_civilian_labor_force_unemployed_2015,pop_gt_16_in_civ_lab_force_unemployed_2019 = pop_gt_16_in_civilian_labor_force_unemployed_2019,pct_below_poverty_2010 = (tot_below_poverty_2010 / tot_pop_2010) *100,pct_below_poverty_2015 = (tot_below_poverty_2015 / tot_pop_2015) *100,pct_below_poverty_2019 = (tot_below_poverty_2019 / tot_pop_2019) *100,pct_hhs_no_vehicles_2010 = (hhs_no_vehicles_2010 / tot_hhs_2010) *100,pct_hhs_no_vehicles_2015 = (hhs_no_vehicles_2015 / tot_hhs_2015) *100,pct_hhs_no_vehicles_2019 = (hhs_no_vehicles_2019 / tot_hhs_2019) *100,pct_pop_gt_16_in_civ_lab_force_unemployed_2010 = (pop_gt_16_in_civ_lab_force_unemployed_2010 / pop_gt_16_in_civ_lab_force_2010) *100,pct_pop_gt_16_in_civ_lab_force_unemployed_2015 = (pop_gt_16_in_civ_lab_force_unemployed_2015 / pop_gt_16_in_civ_lab_force_2015) *100,pct_pop_gt_16_in_civ_lab_force_unemployed_2019 = (pop_gt_16_in_civ_lab_force_unemployed_2019 / pop_gt_16_in_civ_lab_force_2019) *100,pct_pop_chg_2010_2019 = ((tot_pop_2019 - tot_pop_2010) / tot_pop_2010) *100,geometry =st_union(geometry) ) %>%distinct(county, .keep_all =TRUE) %>%left_join(st_drop_geometry(st_counties_base.2010_2019), by =join_by(county == county), suffix =c('', ''), keep =TRUE) %>%st_as_sf(crs =st_crs(st_counties_base.2010_2019))
Land Cover Data
Add some text here
Land Cover Set Up
Add some text here
Code
# set urls to land cover imagesurl_2010 <-'https://edcintl.cr.usgs.gov/downloads/sciweb1/shared/lcmap/public/full_extent_downloads/version_13/primary-landcover_conus_year_data/LCMAP_CU_2010_V13_LCPRI/LCMAP_CU_2010_V13_LCPRI.tif'url_2019 <-'https://edcintl.cr.usgs.gov/downloads/sciweb1/shared/lcmap/public/full_extent_downloads/version_13/primary-landcover_conus_year_data/LCMAP_CU_2019_V13_LCPRI/LCMAP_CU_2019_V13_LCPRI.tif'# dissolve counties into single study are polygonst_full_study_area <- st_counties.2010_2019 %>%st_union()# define crop_raster function - crop land cover rasters to full study area# return both cropped and masked resultscrop_raster <-function(lc_file, study_area) { crp <-crop(lc_file, vect(st_transform(study_area, crs =st_crs(lc_file)))) msk <-mask(crp, vect(st_transform(study_area, crs =st_crs(lc_file))))return(msk)}# define calculate_zonal_stats function - calculate total area of each# land cover category in square miles over the entire study areacalculate_zonal_stats <-function(lc_file) { z <-zonal(cellSize(lc_file, unit ='km'), lc_file,fun=sum ) z <- z %>%mutate(area_sqmi = area *0.386102) %>%rename(land_cover_category =names(lc_file)[[1]]) %>%subset(select =-area)}# set land cover description listland_cover_type <-c(Developed =1,Cropland =2,`Grassland/Shrubland`=3,`Tree Cover`=4,Water =5,Wetlands =6,`Snow and Ice`=7,`Natural Barren`=8)# load three land cover rasters and crop each to study area polygon# find total area of each land cover categories in entire study arealand_cover_2010 <-rast(url_2010)st_lc_2010_masked <-crop_raster(land_cover_2010, st_full_study_area)zonal_stats_2010_study_area <-calculate_zonal_stats(st_lc_2010_masked)
# land cover changes from 2010 to 2019 in entire study arealand_cover_chg_study_area.2010_2019 <-left_join( zonal_stats_2010_study_area, zonal_stats_2019_study_area,by ='land_cover_category', suffix =c('_2010', '_2019')) %>%mutate(chg_in_area_sqmi = ((area_sqmi_2019 - area_sqmi_2010) / area_sqmi_2010) *100)### CREATE GT TABLE OF ABOVE LAND COVER CHANGE DF# convert land_cover_type vector into a dataframe to merge with land_cover_chg data frame# so land cover descriptions are includedlc.type <-data.frame(land_cover_category = land_cover_type, land_cover_desc =rownames(as.data.frame(land_cover_type)))rownames(lc.type) <-1:nrow(lc.type)# join lc.type df with land cover chg results df, then remove numbered category column# and replace all NAs with 0lc_chg.2010_2019 <-left_join(lc.type, land_cover_chg_study_area.2010_2019, by ='land_cover_category') %>%select(-land_cover_category) %>%replace(is.na(.), 0)# create table of land cover changelc_chg.2010_2019 %>%arrange(desc(area_sqmi_2010)) %>%gt(rowname_col ='land_cover_desc') %>%tab_header(title =md('**Land Cover Change in NY Southern Tier**')) %>%cols_label(area_sqmi_2010 =md('**Area (2010)**'),area_sqmi_2019 =md('**Area (2019)**'),chg_in_area_sqmi =md('**% Change**') ) %>%fmt_number(columns =c('area_sqmi_2010', 'area_sqmi_2019'), decimals =1) %>%fmt_percent(columns =c('chg_in_area_sqmi'), decimals =2, scale_values =FALSE)
Land Cover Change in NY Southern Tier
Area (2010)
Area (2019)
% Change
Tree Cover
7,387.1
7,386.8
−0.00%
Cropland
3,416.5
3,417.0
0.02%
Developed
387.2
387.1
−0.02%
Wetlands
378.5
378.4
−0.03%
Water
141.6
141.2
−0.27%
Grassland/Shrubland
109.6
106.5
−2.81%
Natural Barren
25.6
29.0
13.23%
Snow and Ice
0.0
0.0
0.00%
Add some text
Results
[~200 words]
Tables and figures (maps and other graphics) are carefully planned to convey the results of your analysis. Intense exploration and evidence of many trials and failures. The author looked at the data in many different ways before coming to the final presentation of the data.
Show tables, plots, etc. and describe them.
Distressed Area Analysis
Add some text here
Classification criteria comes from: https://www.arc.gov/distressed-areas-classification-system/
Code
# get list of columns that are involved in distressed area classificationdistressed_cols <-c(grep('tot_popE', names(st_tracts.2010_2019), value =TRUE),grep('median_income', names(st_tracts.2010_2019), value =TRUE), grep('pct_below_poverty', names(st_tracts.2010_2019), value =TRUE))# determined distressed area classification of each census tractst_tracts.distressed <- st_tracts.2010_2019[,c('GEOID', 'NAME', 'county', distressed_cols)] %>%mutate(if_Distressed_2010 = (((median_incomeE_2010 / us_data.2010_2019$median_incomeE_2010) <=0.67) & ((pct_below_poverty_2010 / ((us_data.2010_2019$pct_below_povertyE_2010) /100)) >=1.50)),if_Distressed_2015 = (((median_incomeE_2015 / us_data.2010_2019$median_incomeE_2015) <=0.67) & ((pct_below_poverty_2015 / ((us_data.2010_2019$pct_below_poverty_2015) /100)) >=1.50)),if_Distressed_2019 = (((median_incomeE_2019 / us_data.2010_2019$median_incomeE_2019) <=0.67) & ((pct_below_poverty_2019 / ((us_data.2010_2019$pct_below_poverty_2019) /100)) >=1.50)),if_Became_Distressed_2010_2019 = (if_Distressed_2010 ==FALSE& if_Distressed_2019 ==TRUE),if_Distressed_Then_Improved_2010_2019 = (if_Distressed_2010 ==TRUE& if_Distressed_2019 ==FALSE) ) %>%relocate(geometry, .after =last_col())# create spatial table of distressed statuses for each census tractdistressed.tab <- st_tracts.distressed %>%select(GEOID, NAME, county, grep('Distressed', colnames(st_tracts.distressed), value =TRUE))
Add some text here
Code
# map which tracts became distressed between 2010 and 2019mapview( distressed.tab, zcol ='if_Became_Distressed_2010_2019',alpha.regions =0.5,popup ="NAME",layer.name =c("Became Distressed (2010 -> 2019)"))
Add some text here
Code
# map which tracts improved between 2010 and 2019mapview( distressed.tab, zcol ='if_Distressed_Then_Improved_2010_2019',alpha.regions =0.5,popup ="NAME",layer.name =c("Improved (2010 -> 2019)"))
Add some text here
Code
# create table of number of worsened and improved tracts by county, then displaydistressed_chgs <- distressed.tab %>%group_by(county) %>%summarize(across(grep('2010_2019', colnames(.), value =TRUE), \(x) sum(x, na.rm =TRUE) ),tot_tracts =n() ) %>%st_drop_geometry() %>%arrange(desc(if_Became_Distressed_2010_2019))distressed_chgs %>%gt() %>%cols_label(county =md('**County**'),if_Became_Distressed_2010_2019 =md('**Became Distressed**'),if_Distressed_Then_Improved_2010_2019 =md('**Distressed Then Improved**'),tot_tracts =md('**# Tracts**') )
County
Became Distressed
Distressed Then Improved
# Tracts
Broome
5
1
55
Chautauqua
4
1
35
Allegany
2
0
13
Cattaraugus
2
0
21
Cortland
2
0
12
Steuben
2
0
30
Chemung
1
0
22
Chenango
1
0
12
Delaware
1
0
14
Otsego
0
0
17
Schoharie
0
0
8
Schuyler
0
0
5
Tioga
0
0
10
Tompkins
0
0
23
County-Level Distressed and Land Cover Change Analysis
Add some text here
Analysis Set Up
Add some text here
Code
# define calculate_county_level_data function - calculate_county_level_data <-function(lc_file, year) {# initialize index to use in county_data_list index =1# initialize list to hold dataframe for each county county_data_list =vector('list', length =length(southern_tier_counties))# set column names for different land cover categories developed_col =as.symbol(paste('developed_area_sqmi_', as.character(year), sep ='')) natural_barren_col =as.symbol(paste('natural_barren_area_sqmi_', as.character(year), sep ='')) cropland_col =as.symbol(paste('cropland_area_sqmi_', as.character(year), sep ='')) tree_cover_col =as.symbol(paste('tree_cover_area_sqmi_', as.character(year), sep ='')) grassland_col =as.symbol(paste('grassland_area_sqmi_', as.character(year), sep ='')) wetland_col =as.symbol(paste('wetland_area_sqmi_', as.character(year), sep ='')) water_col =as.symbol(paste('water_area_sqmi_', as.character(year), sep =''))# set distressed column name for the year distressed_col_yr =as.symbol(paste('if_Distressed_', as.character(year), sep =''))# set name of tot_pop_distressed column with year appended to end tot_pop_distressed_col <-as.symbol(paste('tot_pop_distressed_', as.character(year), sep =''))# set total population column to use in calculations tot_pop_est_col =as.symbol(paste('tot_popE_', as.character(year), sep ='')) tot_pop_col =as.symbol(paste('tot_pop_', as.character(year), sep ='')) tot_hhs_col =as.symbol(paste('tot_hhs_', as.character(year), sep ='')) tot_below_poverty_col =as.symbol(paste('tot_below_poverty_', as.character(year), sep ='')) pop_civ_lab_force_col =as.symbol(paste('pop_gt_16_in_civ_lab_force_', as.character(year), sep ='')) pop_unemployed_col =as.symbol(paste('pop_gt_16_in_civ_lab_force_unemployed_', as.character(year), sep ='')) pop_distressed_col =as.symbol(paste('tot_pop_distressed_', as.character(year), sep ='')) hhs_no_vehicles_col =as.symbol(paste('hhs_no_vehicles_', as.character(year), sep ='')) area_col =as.symbol(paste('area_sqmi_', as.character(year), sep =''))# set names of new columns to be calculated pct_below_poverty_col =as.symbol(paste('pct_below_poverty_', as.character(year), sep ='')) pct_unemployed_col =as.symbol(paste('pct_unemployed_', as.character(year), sep ='')) pct_pop_distressed_col =as.symbol(paste('pct_pop_distressed_', as.character(year), sep ='')) pct_hhs_no_vehicles_col =as.symbol(paste('pct_hhs_no_vehicles_', as.character(year), sep ='')) pct_developed_col =as.symbol(paste('pct_developed_area_sqmi_', as.character(year), sep ='')) pct_natural_barren_col =as.symbol(paste('pct_natural_barren_area_sqmi_', as.character(year), sep ='')) pct_cropland_col =as.symbol(paste('pct_cropland_area_sqmi_', as.character(year), sep ='')) pct_tree_cover_col =as.symbol(paste('pct_tree_cover_area_sqmi_', as.character(year), sep ='')) pct_grassland_col =as.symbol(paste('pct_grassland_area_sqmi_', as.character(year), sep ='')) pct_wetland_col =as.symbol(paste('pct_wetland_area_sqmi_', as.character(year), sep ='')) pct_water_col =as.symbol(paste('pct_water_area_sqmi_', as.character(year), sep =''))# set names of columns to be edited or removed median_age_est_col =as.symbol(paste('median_ageE_', as.character(year), sep ='')) median_income_est_col =as.symbol(paste('median_incomeE_', as.character(year), sep ='')) new_median_age_est_col =as.symbol(paste('median_age_', as.character(year), sep ='')) new_median_income_est_col =as.symbol(paste('median_income_', as.character(year), sep ='')) median_age_moe_col =as.symbol(paste('median_ageM_', as.character(year), sep ='')) median_income_moe_col =as.symbol(paste('median_incomeM_', as.character(year), sep =''))# iterate over each study county and get population, distressed, and zonal stats data,# then store resulting data frame in df object df <-for (i in southern_tier_counties) {print(i)# get county geography county_sf <- study_counties %>%filter(NAME == i)# get cropped & masked land cover raster for county county_lc_masked <-crop_raster(lc_file, county_sf)# calculate area of each land cover category in county county_zonal_stats <-calculate_zonal_stats(county_lc_masked) %>%summarize(!!developed_col :=filter(., land_cover_category ==1)$area_sqmi,!!natural_barren_col :=filter(., land_cover_category ==8)$area_sqmi,!!cropland_col :=filter(., land_cover_category ==2)$area_sqmi,!!tree_cover_col :=filter(., land_cover_category ==4)$area_sqmi,!!grassland_col :=filter(., land_cover_category ==3)$area_sqmi,!!wetland_col :=filter(., land_cover_category ==6)$area_sqmi,!!water_col :=filter(., land_cover_category ==5)$area_sqmi ) %>%mutate(county = i,geometry = county_sf$geometry) %>%st_as_sf(crs =st_crs(st_counties.2010_2019))# get county stats county_pop_data <-filter(st_counties.2010_2019, county == i) %>%select(county, contains(as.character(year)) &!contains('pct'))# calculate number of people in each county that lived in tracts that# were distressed and improved and in tracts that became distressed county_distressed <- st_tracts.distressed %>%filter(county == i) %>%select(GEOID, NAME, county, contains('Distressed'), contains('tot_popE'))# total number of people living in distress for this year pop_distressed <- county_distressed %>%filter(!!distressed_col_yr ==TRUE) %>%summarize(!!tot_pop_distressed_col :=sum(!!tot_pop_est_col, na.rm =TRUE) ) %>%mutate(county = i) %>%relocate(geometry, .after =last_col())# get number of people who lived in areas that became distressed by 2019 pop_became_distressed <- county_distressed %>%filter(if_Became_Distressed_2010_2019 ==TRUE) %>%summarize(tot_pop_became_distressed_2010_2019 =sum(!!tot_pop_est_col, na.rm =TRUE) ) %>%mutate(county = i) %>%relocate(geometry, .after =last_col())# get number of people who lived in areas that improved by 2019 pop_improved <- county_distressed %>%filter(if_Distressed_Then_Improved_2010_2019 ==TRUE) %>%summarize(tot_pop_improved_2010_2019 =sum(!!tot_pop_est_col, na.rm =TRUE) ) %>%mutate(county = i) %>%relocate(geometry, .after =last_col())# join distressed data-related data frames together county_distressed.data <-st_join(pop_distressed, pop_became_distressed) %>%st_join(pop_improved)# join 3 dataframes, then add to list of county data frames d <-st_join(county_pop_data, county_distressed.data, suffix =c('', '.y')) %>%st_join(county_zonal_stats, suffix =c('', '.z')) %>%select(-c('county.y...17', 'county.y...19', 'county.x', 'county.z', !!median_age_moe_col, !!median_income_moe_col)) %>%rename(!!new_median_age_est_col :=!!median_age_est_col,!!new_median_income_est_col :=!!median_income_est_col ) %>%mutate(!!pct_below_poverty_col := (!!tot_below_poverty_col /!!tot_pop_col) *100,!!pct_unemployed_col := (!!pop_unemployed_col /!!pop_civ_lab_force_col) *100,!!pct_pop_distressed_col := (!!pop_distressed_col /!!tot_pop_col) *100,!!pct_hhs_no_vehicles_col := (!!hhs_no_vehicles_col /!!tot_hhs_col) *100,!!pct_developed_col := (!!developed_col /!!area_col) *100,!!pct_natural_barren_col := (!!natural_barren_col /!!area_col) *100,!!pct_cropland_col := (!!cropland_col /!!area_col) *100,!!pct_tree_cover_col := (!!tree_cover_col /!!area_col) *100,!!pct_grassland_col := (!!grassland_col /!!area_col) *100,!!pct_wetland_col := (!!wetland_col /!!area_col) *100,!!pct_water_col := (!!water_col /!!area_col) *100 )# add data frame to county data list county_data_list[[index]] <- d# add 1 to index index = index +1 }# bind county data frames together, then return list data <-bind_rows(county_data_list)return(data)}
Analysis Processing
Add some text here
Code
# get full counties sfs for 2010 and 2019 and fill NA with 0, then join them into a single sffinal_counties_df.2010<-calculate_county_level_data(st_lc_2010_masked, 2010)
[1] "Allegany"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Broome"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Cattaraugus"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Chautauqua"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Chemung"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Chenango"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Cortland"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Delaware"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Otsego"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Schoharie"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Schuyler"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Steuben"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Tioga"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
[1] "Tompkins"
New names:
• `county.y` -> `county.y...17`
• `county.y` -> `county.y...19`
Changes in Population Characteristics, Economic Indicators, and Land Cover
By Southern Tier County (2010 - 2019)
% Pop Change
% HHs Change
% Median Age Change
% Median Income Change
% Pop Below Poverty Change
% Unemployment Rate Change
% Pop Distressed Change
% HHs No Vehicle Change
% Developed Change
% Natural Barren Change
% Cropland Change
% Tree Cover Change
% Grassland Change
% Wetland Change
% Water Change
Allegany
−4.9%
−5.5%
5.3%
18.9%
1.2%
−80.7%
1,000,000.0%
17.2%
−0.3%
9.8%
−0.1%
0.1%
−9.8%
−0.1%
−1.6%
Broome
−3.8%
−2.8%
0.0%
20.7%
12.4%
−19.7%
57.4%
4.6%
−0.0%
20.8%
−0.1%
−0.0%
−5.7%
0.0%
−0.2%
Cattaraugus
−4.5%
−2.7%
4.2%
21.7%
4.2%
0.0%
331.7%
32.2%
−0.1%
9.0%
0.0%
−0.2%
9.2%
−0.1%
−0.0%
Chautauqua
−5.0%
−3.6%
5.7%
22.8%
6.7%
−47.6%
61.0%
9.6%
−0.3%
33.1%
0.0%
−0.3%
1.5%
−0.1%
−0.0%
Chemung
−4.3%
−4.0%
2.0%
28.8%
−8.0%
−10.3%
20.5%
−0.4%
−0.2%
8.4%
0.2%
−0.1%
2.2%
−0.0%
−0.5%
Chenango
−6.1%
3.5%
6.1%
19.9%
−0.4%
−37.0%
1,000,000.0%
39.0%
0.4%
21.5%
−0.1%
0.1%
−7.3%
−0.0%
1.1%
Cortland
−3.2%
−0.9%
2.2%
23.3%
15.5%
−72.1%
1,000,000.0%
−22.9%
0.5%
22.7%
−0.1%
−0.1%
5.4%
−0.1%
0.8%
Delaware
−6.7%
−5.7%
7.3%
20.6%
21.6%
−36.5%
1,000,000.0%
21.2%
−0.4%
5.2%
−0.0%
0.1%
−8.8%
−0.0%
−0.7%
Otsego
−4.2%
−6.1%
6.2%
21.8%
−1.4%
−54.8%
0.0%
6.9%
−0.2%
43.2%
−0.1%
0.3%
−10.4%
0.0%
−0.3%
Schoharie
−4.8%
−3.3%
9.6%
14.3%
7.8%
−43.6%
0.0%
8.3%
−0.0%
22.2%
0.1%
−0.1%
6.9%
−0.0%
−3.2%
Schuyler
−3.5%
−2.1%
9.1%
30.3%
88.2%
−12.8%
0.0%
−23.6%
0.3%
40.8%
0.5%
−0.1%
−12.8%
−0.1%
−0.2%
Steuben
−2.3%
−1.7%
5.1%
25.0%
0.8%
−43.7%
268.3%
15.3%
0.1%
10.4%
0.1%
−0.0%
−2.2%
−0.0%
−0.6%
Tioga
−5.2%
−1.8%
6.4%
31.2%
5.8%
−91.2%
0.0%
−0.8%
0.2%
18.1%
−0.0%
−0.1%
1.1%
−0.1%
0.1%
Tompkins
2.0%
3.0%
5.0%
15.9%
0.3%
56.5%
−100.0%
9.1%
0.3%
11.6%
0.0%
0.1%
−6.6%
−0.1%
0.0%
Analysis Correlations
Add code and some text here
Code
# filter for only pct chg columnscounties.pct_chg <- final_counties_df %>%select(county, grep('chg', colnames(.), value =TRUE)) %>%st_drop_geometry()# calculate correlation coefficients between all variablescorrelations <-cor(select(counties.pct_chg, -c(county)))# display heatmap of correlationsheatmaply( correlations, colors =colorRampPalette(c('blue', 'white', 'orange'))(100), dendrogram ='none', limits =c(-1, 1), branches.lwd =0.5)
Add some text here
Code
# create data frame of correlation coefficients between # pct distressed pop chg and other variablescor.df <-melt(correlations) %>%filter(Var1 =='pct_pop_distressed_chg', Var2 !='pct_pop_distressed_chg') %>%select(-Var1) %>%rename(Variable = Var2,Coefficient = value ) %>%arrange(desc(Coefficient))# display table of correlation with pct distressed pop chg in descending order of correlationcor.df %>%gt() %>%tab_header(title =md('**Correlation with Percent Distressed Population Change**')) %>%cols_align(align ='center') %>%cols_label(Variable =md('**Variable**'),Coefficient =md('**Coefficient**') ) %>%fmt_number(columns = Coefficient, decimals =3)
Correlation with Percent Distressed Population Change
Variable
Coefficient
pct_tree_cover_chg
0.346
pct_hhs_no_vehicles_chg
0.202
pct_water_chg
0.166
pct_hhs_chg
0.064
pct_developed_chg
0.062
pct_median_age_chg
−0.012
pct_wetland_chg
−0.018
pct_below_poverty_chg
−0.044
pct_grassland_chg
−0.230
pct_median_income_chg
−0.242
pct_natural_barren_chg
−0.271
pct_unemployed_chg
−0.373
pct_pop_chg
−0.378
pct_cropland_chg
−0.446
Add some text here
Conclusions
[~200 words]
Clear summary adequately describing the results and putting them in context. Discussion of further questions and ways to continue investigation.
K C, S., Gyawali, B. R., Lucas, S., Antonious, G. F., Chiluwal, A., & Zourarakis, D. (2024). Assessing Land-Cover Change Trends, Patterns, and Transitions in Coalfield Counties of Eastern Kentucky, USA. Land, 13(9), 1541. https://doi.org/10.3390/land13091541
Distressed area methodology: https://www.arc.gov/distressed-areas-classification-system/
Primary Land Cover data: https://www.usgs.gov/special-topics/lcmap/collection-13-conus-science-products
2010 US data: Total HHs: https://www2.census.gov/library/publications/cen2010/briefs/c2010br-14.pdf (p. 5) Median Age: https://www.census.gov/newsroom/releases/archives/2010_census/cb11-cn144.html Median Family Income: https://www.huduser.gov/portal/datasets/il/il10/Medians2010.pdf Poverty Rate: https://www2.census.gov/library/publications/2012/acs/acsbr11-01.pdf